Welcome back! In the last post we derived eigenvectors. In particular, we saw how useful they are in analyzing matrices we need to apply again and again.
In this post, we're going to dive into one of the most famous applications of eigenvectors - the original PageRank algorithm that allowed Google to create the world's best search engine. PageRank grew out of Larry Page's PhD research at Stanford. He spun out this amazing algorithm to redefine search and create one of the most iconic companies in the modern era. We're going to re-derive it here and see that it's just a simple application of eigenvectors!
By the end of this post, you will:
Feel like you could have come up with PageRank yourself.
Understand how eigenvectors are central to PageRank.
See how central eigenvectors are to dynamic processes in general.
Making Friends
Ok, let's build out our understanding of PageRank using a hypothetical scenario.
Let's say you just moved to a new city and you’re trying to make some new friends. You need to figure out which of your new friends to hang out with in your free time. But unfortunately you’re super busy and have very little free time!
So you need to rank all these potential friends to prioritize which ones to chill with.
This could be you! Just need to find the friends...
In a moment of clarity you realize the following:
Friends you’re close with are probably going to hang out with people you’d like.
Someone everyone is friends with is probably someone you are going to be friends with.
You decide to use these two insights to create a scheme you call FriendRank.
FriendRank gives a potential friend a score (FriendScore) based on:
How many people are friends with this person weighted by,
How close you are to the people who are friends with them.
In the above picture, how close you are to a person is shown by the size of the circle. To find your FriendScore for potential friend $F$, you just add up how close you are to $A$, $C$, and $D$.
Let’s write this down as a formula. Let’s say the strength of the friendship between you and some person $p$ is $FS(p)$. Let $Friends(p)$ be the set of friends of $p$.
The FS (FriendScore) for a potential friend Fred is:
$$FS(Fred) = \sum_{p \in Friends(Fred)} FS(p)$$
Calculating Our First Friend Score
Let’s work through an example. Say you have 4 potential friends:
Jimmy (closeness score - 5)
Maya (closeness score - 9)
Lisa (closeness score - 8)
Fred (?)
In this world, you’re really close to Maya (with a score of 9) and Lisa (8) but only kinda close to Jimmy (5).
Only Jimmy and Maya are friends with Fred, so Fred's score is $FS(Fred) = FS(Jimmy) + FS(Maya) = 5 + 9 = 14$.
Graphically, this is:
Only Jimmy and Maya claim to be friends with Fred. Lisa seems to be tearing him apart.
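If it helps to see this as code, here's a minimal Python sketch of the formula so far, using the closeness scores and friendships from this example (the names and numbers are just the ones above):

```python
# Closeness scores for the people you already know.
closeness = {"Jimmy": 5, "Maya": 9, "Lisa": 8}

# Only Jimmy and Maya claim to be friends with Fred.
friends_of_fred = ["Jimmy", "Maya"]

# FS(Fred): add up your closeness to everyone who is friends with Fred.
fs_fred = sum(closeness[p] for p in friends_of_fred)
print(fs_fred)  # 14
```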
Accounting For People who are Friends With Everyone
Not such a crazy idea right? You start this process and you realize one thing. Your really close friend Maya claims to be friends with EVERYONE. You start wondering, "Hmm, maybe I need to downweight Maya's opinion because she doesn't really seem to be that picky."
This is easy - we just divide Maya's weighting score by how many people she claims to be friends with. That way if Maya claims 100 friends, her weighting for each friend is divided by 100. So our formula is now:
$$FS(Fred) = \sum_{p \in Friends(Fred)} FS(p) \cdot \frac{1}{NumFriends(p)}$$
Let
$$IsFriends(p, Fred) = \begin{cases} 1 & \text{if } p \text{ calls Fred a friend} \\ 0 & \text{otherwise} \end{cases}$$
Then we can simplify our earlier formula as shown below:
$$FS(Fred) = \sum_{p} FS(p) \cdot \frac{IsFriends(p, Fred)}{NumFriends(p)}$$
where the sum now runs over all people $p$.
All the people who aren’t friends with Fred just get $0$ terms which gives us the same thing.
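In code, the downweighting is just a division by the length of each person's claimed friend list. Here's a small Python sketch (the friend lists below are hypothetical, purely to illustrate the formula):

```python
# Hypothetical closeness scores and claimed friend lists.
closeness = {"Jimmy": 5, "Maya": 9, "Lisa": 8}
friends = {
    "Jimmy": ["Fred", "Maya"],                   # Jimmy claims 2 friends
    "Maya":  ["Fred", "Jimmy", "Lisa", "Bob"],   # Maya claims everyone
    "Lisa":  ["Jimmy"],
}

def friend_score(target):
    # FS(target) = sum over p of FS(p) * IsFriends(p, target) / NumFriends(p)
    return sum(
        closeness[p] * (target in claimed) / len(claimed)
        for p, claimed in friends.items()
    )

print(friend_score("Fred"))  # 5 * 1/2 + 9 * 1/4 = 4.75
```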
So Who's Friends With Whom?
Ok this is looking great! You decide to go ahead and implement this scheme with your friends Bobby, Lauren, Daniel, Rony, and Saba. You start by asking all your friends to write down who they’re friends with! So you send them all the following email:
From: parry.lage@boogle.com
To: Bobby, Lauren, Daniel, Rony, Saba
Subject: Who are your friends?
My dearest "friends",
I’m trying to rank you and all my other potential friends and need your help to do so. Below is a list of people. Please put a 1 next to their name if you call them your friend.
Person   Friends?
Bobby    ___
Lauren   ___
Daniel   ___
Rony     ___
Saba     ___
Yours truly,
Parry Lage
P.S. Just because I'm sending you this email doesn't mean we're friends.
So Saba sends you something like this:
Person   IsFriends
Bobby    1
Lauren   1
Daniel   0
Rony     0
Saba     0
Looking good so far! You then remember you need to downweight Saba's score by how many friends she claims to have. She's claiming 2 friends here, so we divide by 2, giving us:
Person   IsFriends
Bobby    1/2
Lauren   1/2
Daniel   0
Rony     0
Saba     0
Sweet.
You now decide to organize all your friends' responses into a little table so that it's easy to see. Each column of the table is a different friend. It looks like this (notice Saba's answers above are the last column in the table below):
         Bobby   Lauren   Daniel   Rony   Saba
Bobby     0       1        1/3      0     1/2
Lauren   1/4      0        1/3      0     1/2
Daniel   1/4      0         0       0      0
Rony     1/4      0         0       0      0
Saba     1/4      0        1/3      1      0
This table is just a matrix, which we'll call $A$. The element $i,j$ in the matrix is $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$. For example, take the entry in row 1 (Bobby), column 5 (Saba):
$IsFriends(5,1)$ is whether Saba claims to be friends with Bobby. We saw in her response she did, so $IsFriends(5,1) = 1$.
$NumFriends(5)$ is the number of friends Saba has (2). Hence $NumFriends(5) = 2$.
So $A[1,5] = \frac{1}{2}$, just as it is in the table.
Notice additionally that $IsFriends$ is not symmetric! For instance, Rony may claim to be friends with Saba, but in the above example, Saba doesn't call Rony a friend.
Hence, $IsFriends(Rony, Saba) = 1$, but $IsFriends(Saba, Rony) = 0$.
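To make this concrete, here's a short Python/NumPy sketch that builds $A$ from claimed friend lists (the lists below are ones consistent with the table; note the code uses 0-indexing while the text uses 1-indexing):

```python
import numpy as np

people = ["Bobby", "Lauren", "Daniel", "Rony", "Saba"]

# Claimed friend lists consistent with the table above.
friends = {
    "Bobby":  ["Lauren", "Daniel", "Rony", "Saba"],
    "Lauren": ["Bobby"],
    "Daniel": ["Bobby", "Lauren", "Saba"],
    "Rony":   ["Saba"],
    "Saba":   ["Bobby", "Lauren"],
}

n = len(people)
A = np.zeros((n, n))
for j, p in enumerate(people):         # column j holds person p's answers
    for f in friends[p]:
        i = people.index(f)            # row i is the person p claims
        A[i, j] = 1 / len(friends[p])  # IsFriends(j, i) / NumFriends(j)

print(A[0, 4])  # Bobby's row, Saba's column: 0.5, just like the table
```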
Adding Your Own Scores
You then decide you want to come up with an initial guess for everyone's FriendScore! There are some people you know better than others, so this is really a guess for what your final FriendScore is for each of them. You write down the following and call it $FS_0$ - your initial FriendScore:
Person   FriendScore
Bobby    1
Lauren   1
Daniel   1
Rony     1
Saba     1
Combining Everyone's Opinion
Omg this is a lot more work than I expected.
Ok we now have the following:
$A$ - a matrix whose elements are $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$.
$FS_0$ - an initial guess of everyone's FriendScore.
Now that we know who's friends with whom, we can figure out who's really likely to be our friend.
We'd like to come up with a new FriendScore $FS_1$ based on this information.
How do we do that?
Recall from earlier our formula for Fred's FriendScore:
$$FS(Fred) = \sum_{p} FS(p) \cdot \frac{IsFriends(p, Fred)}{NumFriends(p)}$$
Remember, the elements of $A$ are just $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$. So the first row of $A$ holds exactly the weights $\frac{IsFriends(p, Bobby)}{NumFriends(p)}$ for each person $p$ - multiplying that row by the vector of FriendScores is exactly our formula for Bobby's FriendScore!
So, if we call $FS_1$ the result of folding the results of $A$ into your initial predictions, we get:
$$A \cdot FS_0 = FS_1$$
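Continuing the NumPy sketch from earlier (reusing `A`, `n`, and `people`), this update is a single matrix-vector product:

```python
FS_0 = np.ones(n)  # our initial guess: everyone starts at 1
FS_1 = A @ FS_0    # fold everyone's opinions into the guess
print(dict(zip(people, FS_1)))
```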
When Do We Stop?
Ok so we now have a super quick way of finding our next FriendScore. Just take our current guess, which we'll call $FS_i$, multiply it by $A$, and obtain $FS_{i+1}$:
$$FS_{i+1} = A \cdot FS_i$$
Every time we compute $A \cdot FS_i$, we are updating our guess of the FriendScores with the rest of our friends' opinions.
When do we stop this process? We stop when $FS_{i+1}$ stops changing:
$$FS_{i+1} = FS_i = FS^*$$
At this point, which we call $FS^*$, we have:
$$A \cdot FS^* = FS^*$$
Does this look like the following?
$$Ax = \lambda x$$
Yes!
This is exactly the formula for an eigenvector!
So the final solution $FS^*$ is an eigenvector of $A$ - one with eigenvalue $\lambda = 1$!
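Here's the whole process as a short sketch, continuing the NumPy example from before (this repeated multiplication is usually called power iteration). Since $A$'s columns each sum to 1, the total score is conserved as we iterate:

```python
FS = np.ones(n)
for _ in range(1000):              # plenty of steps to converge
    FS_next = A @ FS
    if np.allclose(FS_next, FS):   # stop when the scores stop changing
        break
    FS = FS_next

print(dict(zip(people, FS)))       # the final FriendScores, FS*
print(np.allclose(A @ FS, FS))     # True: FS* is an eigenvector, eigenvalue 1
```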
Seeing This Visually
How does $FS_i$ move to $FS^*$?
In many ways, the matrix $A$ pulls $FS_i$ towards its eigenvector $FS^*$.
There’s an awesome visualization of this here by setosa.io that I highly recommend you check out. Here’s an extract directly from their writeup. In the below image, $s_1$ is an eigenvector of $A$. Notice how as we keep applying $A$ on $v$, we move closer to $s_1$. In particular, the sequence $v$, $Av$, $A^2v$, ... eventually ends up directly on $s_1$.
As you keep applying $A$ on $v$, $v$ gets pulled towards the eigenvector $s_1$.
Similarly, in FriendRank, as we keep multiplying $FS_0$ by $A$, we get the sequence $A \cdot FS_0$, $A^2 \cdot FS_0$, ... which eventually ends up on $FS^*$ - the eigenvector of $A$.
From Friends to Pages
You’ve probably guessed at this point that you can switch the word friends to pages in the above and get PageRank! In the world of webpages:
One webpage claims to be “friends” with another if it links out to the webpage.
Google finds out who's friends with whom (our initial survey) by just crawling the web and finding out who links to whom.
The number of "friends" a webpage claims to have is just the number of links present on the webpage.
We create the matrix $A$ by crawling the web. Once we have it, we find its eigenvector to get an estimate of the PageRanks.
To get PageRank, just replace people with pages.
Putting this together gives us a way of ranking all webpages on the internet for relevance!
I find it really cool how PageRank models the abstract concept of webpages using simple insights into how we interact as humans.
A few quick asides:
The formula we've derived for PageRank is a simplified version of the actual one - the main addition is a damping factor, sketched after this list. To see the full formula, read the original paper here.
Notice that PageRank is actually a next level pun. PageRank was written by Larry Page about ranking webpages - well done Larry.
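For the curious: the damping factor $d$ (typically around 0.85) models a surfer who occasionally jumps to a completely random page instead of following a link. Roughly, in our notation, with $N$ total pages:

$$PR(page) = \frac{1-d}{N} + d \sum_{p} PR(p) \cdot \frac{IsFriends(p, page)}{NumFriends(p)}$$

This small tweak keeps the process from getting stuck on pages with no outgoing links and guarantees the iteration converges to a unique answer.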
Conclusion
So we’ve seen that eigenvectors are at the heart of one of the most important algorithms of all time! More generally, whenever you see a linear function being applied again and again and again (like $A$), you are without a doubt going to be looking for the eigenvectors of that linear function/matrix. The eigenvectors will tell you where that process ends!
I hope you enjoyed this unique and powerful application of eigenvectors. In the next post, we're going to step back into Matrix Algebra. We'll learn about Kernels (or Nullspaces) and visually see how they help us immediately grasp what matrices do to their inputs.